In this paper, the combined effects of image gathering, sampling, and reconstruction
are analyzed in terms of image fidelity. The analysis is based upon a standard end-to-end
linear system model which is sufficiently general that the results apply to most line-scan
and sensor-array imaging systems. Shift-variant sampling effects are accounted for with
an expected value analysis based upon the use of a fixed deterministic input scene which
is randomly shifted (mathematically) relative to the sampling grid. This random
sample-scene phase approach has been used successfully by the author and associates in
several previous related papers [1]-[4].

Formulation 

The end-to-end linear model upon which the results of this paper are based is
characterized by three independent system components: an input scene $f(x,y)$, an image
gathering point spread function $h(x,y)$, and an image reconstruction point spread function
$r(x,y)$. All three of these components are referenced to a common orthogonal spatial
coordinate system $(x,y)$ normalized so that the sampling interval in both directions is
unity. That is, sampling occurs at the integer coordinates $(m,n)$. Because of this
normalizing convention, when the model is analyzed in the Fourier domain, the associated
spatial frequencies $(\mu,\nu)$ have units of cycles/pixel and the Nyquist (folding)
frequency is 0.5.

For notational convenience two other components are introduced: the pre-sampling
image $g(x,y)$ and the reconstructed image $f'(x,y)$. The end-to-end model that relates
the input scene $f$ to the output reconstructed image $f'$ is then

$$f(x,y) \xrightarrow{\;*h\;} g(x,y) \xrightarrow{\;\text{sample}\;} g(m,n) \xrightarrow{\;*r\;} f'(x,y)$$

where the $*$ operator denotes 2-d spatial convolution and $g(m,n)$ is $g(x,y)$ sampled onto
the pixel grid. This model is the basis for all the analysis that follows and, consequently,
the results of this paper are applicable to the fidelity analysis of any sampled imaging
system whose performance is characterized by the equation

$$f'(x,y) = \big[[f(x,y) * h(x,y)]\,\mathrm{comb}(x,y)\big] * r(x,y) \tag{2a}$$

where

$$\mathrm{comb}(x,y) = \sum_m \sum_n \delta(x-m,\,y-n) \tag{2b}$$

is the conventional 2-d comb function which accounts for sampling. 
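
As a minimal illustration of (2a) and (2b), the sketch below implements a 1-d analog of the model on a periodic fine grid with K points per pixel. The fine-grid discretization, the periodic boundary, the helper names (`circ_conv`, `box_psf`, `triangle_psf`, `end_to_end`) and the particular box and triangle PSFs are assumptions made for illustration; they are not part of the analysis.

```python
import numpy as np

def circ_conv(a, b):
    """Circular convolution of two equal-length real arrays via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def box_psf(N, width):
    """Hypothetical image gathering PSF: a centred box, normalized to unit sum."""
    d = np.minimum(np.arange(N), N - np.arange(N))  # periodic distance from origin
    h = (d <= width // 2).astype(float)
    return h / h.sum()

def triangle_psf(N, K):
    """Hypothetical reconstruction PSF r(x) = max(0, 1 - |x|) sampled at x = n / K."""
    d = np.minimum(np.arange(N), N - np.arange(N))
    return np.maximum(0.0, 1.0 - d / K)

def end_to_end(f, h, r, K):
    """1-d analog of (2a): f' = [(f * h) comb] * r, with K fine-grid points per pixel."""
    g = circ_conv(f, h)             # pre-sampling image g = f * h
    comb = np.zeros_like(g)
    comb[::K] = 1.0                 # unit impulses at the integer pixel coordinates
    f_rec = circ_conv(g * comb, r)  # reconstructed image f'
    return g, f_rec

N, K = 64, 4
x = np.arange(N) / K                          # position in pixels
f = 1.0 + np.cos(2 * np.pi * x / 8.0)         # smooth scene with a period of 8 pixels
g, f_rec = end_to_end(f, box_psf(N, 3), triangle_psf(N, K), K)
print(np.max(np.abs(g - f_rec)))              # largest gap between g and f'
```

Because sampling occurs at the integer coordinates and the triangle PSF is 1 at the origin and 0 at every other pixel, `f_rec` agrees with `g` at every K-th fine-grid point and can differ in between.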

Image Fidelity

A variety of metrics have been advocated to measure how well one image matches
another. These metrics include the 1-norm

$$\|f-g\|_1 = \int_x\!\int_y |f(x,y)-g(x,y)|\,dx\,dy$$

the common RMS 2-norm

$$\|f-g\| = \left(\int_x\!\int_y |f(x,y)-g(x,y)|^2\,dx\,dy\right)^{1/2}$$

which generalizes for $p \ne 2$ to the p-norm

$$\|f-g\|_p = \left(\int_x\!\int_y |f(x,y)-g(x,y)|^p\,dx\,dy\right)^{1/p}$$

and which approaches the $\infty$-norm

$$\|f-g\|_\infty = \max_{x,y} |f(x,y)-g(x,y)|$$

in the limit as $p \to \infty$. Of these, the RMS norm $\|f-g\|$ is far and away the most common,
presumably because it lends itself so well to mathematical analysis.
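
For images held as arrays on a common grid, these norms can be approximated by Riemann sums. The sketch below is one way to do so; the function name `diff_norm` and the grid spacings `dx`, `dy` are illustrative assumptions.

```python
import numpy as np

def diff_norm(f, g, p=2.0, dx=1.0, dy=1.0):
    """Riemann-sum approximation of the p-norm of f - g (p = np.inf gives the max norm)."""
    d = np.abs(np.asarray(f, dtype=float) - np.asarray(g, dtype=float))
    if np.isinf(p):
        return d.max()
    return (np.sum(d ** p) * dx * dy) ** (1.0 / p)

f = np.array([[1.0, 2.0], [3.0, 4.0]])
g = np.zeros_like(f)
for p in (1, 2, 8, 64, np.inf):
    print(p, diff_norm(f, g, p))   # with unit spacing the values fall toward the max norm
```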

The RMS norm squared

$$\|f-g\|^2 = \int_x\!\int_y |f(x,y)-g(x,y)|^2\,dx\,dy \tag{3}$$

is a measure of image fidelity [5]. Specifically, the conventional definition of fidelity is

$$\text{fidelity} = 1 - \frac{\|f-g\|^2}{\|f\|^2}. \tag{4}$$

The primary purpose of this paper is to illustrate how the method of sample-scene phase
averaging can be used to derive expressions for the three fundamental “fidelity loss” metrics

$$\|f-g\|^2 \quad\text{and}\quad \|g-f'\|^2 \quad\text{and}\quad \|f-f'\|^2.$$

The first of these metrics is a measure of image blur, the common loss of high spatial
frequencies caused, for example, by defocus [5]. The second is sampling and reconstruction
blur, the loss of fidelity caused by sampling (aliasing) and imperfect reconstruction [1]. The
third, and most important, metric is the end-to-end blur, the net loss of fidelity caused
by the combined effects of image gathering, sampling, and reconstruction [6], [7]. Each of
these fidelity loss terms will be analyzed in order, beginning with image blur.

Image Blur


The conventional continuous-continuous model of image formation (image gathering)
is that the process is both linear and shift-invariant. That is, $f$ and $g$ are related by a
convolution as

$$g(x,y) = \int_{x'}\!\int_{y'} h(x-x',\,y-y')\,f(x',y')\,dx'\,dy' \tag{5a}$$

where $h(x,y)$ is the image gathering point spread function (PSF), conventionally normalized
so that

$$\int_x\!\int_y h(x,y)\,dx\,dy = 1. \tag{5b}$$


This model is much more easily understood when expressed in the spatial frequency
$(\mu,\nu)$ domain as

$$\hat g(\mu,\nu) = \hat h(\mu,\nu)\,\hat f(\mu,\nu) \tag{6a}$$

where

$$\hat g(\mu,\nu) = \int_x\!\int_y g(x,y)\exp(-2\pi i(x\mu + y\nu))\,dx\,dy \tag{6b}$$

is the Fourier transform of $g$ and the transforms $\hat h$, $\hat f$ are defined analogously.

It is well known that the PSF $h$ typically acts as a low-pass filter. As a result, $g$ is a
blurred copy of $f$ and the extent of this image blur is

$$\|f-g\|^2 = \int_x\!\int_y |f(x,y)-g(x,y)|^2\,dx\,dy \tag{7a}$$

which can be rewritten, using the energy (Parseval's) theorem, as

$$\|f-g\|^2 = \int_\mu\!\int_\nu |\hat f(\mu,\nu)-\hat g(\mu,\nu)|^2\,d\mu\,d\nu. \tag{7b}$$

However, from equation (6a), this last equation can be written as

$$\|f-g\|^2 = \int_\mu\!\int_\nu |1-\hat h(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2\,d\mu\,d\nu. \tag{7c}$$

Note that if some metric other than the $\|\cdot\|^2$ norm were used, the energy theorem would
not be applicable and the corresponding easy transition from a spatial domain integral
to a corresponding frequency domain integral would not be possible. As the following
discussion illustrates, this easy transition is a powerful argument in favor of the squared
RMS metric. That is, the insight provided by equation (7c) is profound.

• Both terms in the integral are non-negative. Therefore,

$$\|f-g\|^2 = 0 \iff |1-\hat h(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2 = 0 \text{ for all } (\mu,\nu).$$

• Image blur is significant $\iff$ the scene has significant energy at spatial
frequencies $(\mu,\nu)$ where the optical transfer function (OTF) $\hat h(\mu,\nu)$ is significantly
different from 1.

• Although the scene energy tends to decrease rapidly with increasing spatial frequency, 
most “natural” scenes have energy at all spatial frequencies. That is, natural scenes 
are not band-limited. 

• The OTF typically decreases smoothly in magnitude from 1 at low spatial frequencies 
to 0 at high frequencies. Thus image blur is caused by a suppression of moderate to 
high spatial frequencies. 

All these observations are well known. However, the point is that they follow
immediately by inspection of the frequency domain integral equation for $\|f-g\|^2$. This
observation is the motivation for a search to find analogous equations for $\|g-f'\|^2$ and
$\|f-f'\|^2$.
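
The equivalence of (7a) and (7c) can be checked numerically. The sketch below does so for a discrete, periodic 1-d analog, in which the DFT plays the role of the Fourier transform and Parseval's theorem carries a factor 1/N. The Gaussian PSF, the test scene and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_psf(N, sigma):
    """Periodic Gaussian PSF on N points, normalized so that sum(h) = 1 as in (5b)."""
    d = np.minimum(np.arange(N), N - np.arange(N))
    h = np.exp(-0.5 * (d / sigma) ** 2)
    return h / h.sum()

def image_blur_spatial(f, h):
    """Discrete analog of (7a): sum of |f - g|^2 with g = f * h (circular)."""
    g = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(f)))
    return np.sum((f - g) ** 2)

def image_blur_frequency(f, h):
    """Discrete analog of (7c): sum of |1 - H|^2 |F|^2, divided by N (Parseval)."""
    F, H = np.fft.fft(f), np.fft.fft(h)
    return np.sum(np.abs(1.0 - H) ** 2 * np.abs(F) ** 2) / f.size

rng = np.random.default_rng(0)
f = np.cumsum(rng.standard_normal(256))      # rough scene with energy at all frequencies
h = gaussian_psf(256, sigma=2.0)
print(image_blur_spatial(f, h), image_blur_frequency(f, h))   # the two values agree
```

Plotting the integrand `np.abs(1.0 - H) ** 2 * np.abs(F) ** 2` against frequency shows directly which spatial frequencies contribute to the blur.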

Sampling 

The conventional continuous-discrete-continuous (end-to-end) model of image
gathering, sampling, and reconstruction is the convolution equation

$$f'(x,y) = \sum_m \sum_n g(m,n)\,r(x-m,\,y-n) \tag{8a}$$

where $f'$ is the (continuous) reconstructed image and (as before) $g = h * f$. The
(discrete-to-continuous) reconstruction process is conventionally assumed to be both linear and
shift-invariant. It is therefore completely characterized by the reconstruction point spread
function $r$, conventionally normalized so that

$$\int_x\!\int_y r(x,y)\,dx\,dy = 1. \tag{8b}$$

This PSF can be thought of as the (continuous) output corresponding to a (discrete) 
sampled input which is 1 at the origin (m = n = 0) of the sampling grid and 0 at all 
other grid points. The reconstruction function is a low-pass filter which accounts for the 
combined effects of all post-sampling operations such as resampling and display. 

The (continuous-to-discrete) sampling process is linear. However,
sampling is not a shift-invariant process.

That is, sampling causes the end-to-end system to be shift-variant. This sample-scene
phase dependence complicates the end-to-end analysis significantly. For example, the
end-to-end fidelity loss expression that one would write by analogy with equation (7c) is

$$\|f-f'\|^2 \stackrel{?}{=} \int_\mu\!\int_\nu |1-\hat h(\mu,\nu)\,\hat r(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2\,d\mu\,d\nu.$$

However (except in special cases) this equation is not correct.
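
The shift-variance introduced by sampling is easy to exhibit numerically. In the 1-d sketch below (a periodic fine grid with K points per pixel and linear-interpolation reconstruction, both assumptions made for illustration), shifting the pre-sampling image by a whole pixel simply shifts the reconstruction, whereas a sub-pixel shift produces a different reconstruction.

```python
import numpy as np

def sample_and_reconstruct(g, K):
    """Sample g at every K-th fine-grid point, then reconstruct by linear interpolation."""
    N = g.size
    d = np.minimum(np.arange(N), N - np.arange(N))
    r = np.maximum(0.0, 1.0 - d / K)             # triangle PSF, r = 1 at the origin
    s = np.zeros(N)
    s[::K] = g[::K]                              # g(x) comb(x) on the fine grid
    return np.real(np.fft.ifft(np.fft.fft(s) * np.fft.fft(r)))

N, K = 64, 4
n = np.arange(N)
g = np.sin(2 * np.pi * 5 * n / N) + 0.5 * np.cos(2 * np.pi * 11 * n / N)

out = sample_and_reconstruct(g, K)
out_pixel = sample_and_reconstruct(np.roll(g, K), K)   # shift by one whole pixel
out_sub = sample_and_reconstruct(np.roll(g, 1), K)     # shift by a quarter pixel

print(np.allclose(out_pixel, np.roll(out, K)))   # True: whole-pixel shifts commute
print(np.allclose(out_sub, np.roll(out, 1)))     # False: sub-pixel shifts do not
```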

Although the end-to-end model is not shift-invariant, it can be demonstrated that by
using sample-scene phase averaging the metrics $\|f-f'\|^2$ and $\|g-f'\|^2$ can be written as

$$\|f-f'\|^2 = \int_\mu\!\int_\nu [\text{non-negative}]\;|\hat f(\mu,\nu)|^2\,d\mu\,d\nu \tag{9a}$$

$$\|g-f'\|^2 = \int_\mu\!\int_\nu [\text{non-negative}]\;|\hat g(\mu,\nu)|^2\,d\mu\,d\nu. \tag{9b}$$


Sample-Scene Phase Averaging 

As first established in references [1] and [2], sample-scene phase averaging consists of 
the following steps. 

• Fix the sampling grid. 

• Shift the scene a random amount $(u,v)$ relative to the fixed sampling grid

$$f(x,y) \to f(x-u,\,y-v).$$

• Calculate (in the frequency domain) the corresponding shifted pre-sampling image

$$g(x,y) \to g(x-u,\,y-v)$$

and reconstructed image

$$f'(x,y) \to f'(x,y;u,v).$$

• Assume that the random $u$ and $v$ shifts are independently and uniformly distributed
between 0 and 1.

• Calculate (in the frequency domain) the expected values

$$E\big[\|f-f'\|^2\big] = \int_0^1\!\int_0^1 \|f-f'\|^2\,du\,dv$$

and

$$E\big[\|g-f'\|^2\big] = \int_0^1\!\int_0^1 \|g-f'\|^2\,du\,dv.$$

• Observe that the image blur is independent of the sample-scene phase so that

$$E\big[\|f-g\|^2\big] = \|f-g\|^2.$$


The results of this process are expected value equations consistent with (9a) and (9b). 
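
The steps above can be sketched numerically. The version below is a 1-d, spatial-domain illustration rather than the frequency-domain calculation used in the analysis: the scene lives on a periodic fine grid with K points per pixel, and the uniform phase distribution on [0, 1) is approximated by the K equally likely shifts 0, 1/K, ..., (K-1)/K. These discretization choices, the PSFs, the bar-pattern scene and all function names are assumptions.

```python
import numpy as np

def circ_conv(a, b):
    """Circular convolution of two equal-length real arrays via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def losses_for_shift(f, h, r, K, u):
    """Return (||f-g||^2, ||g-f'||^2, ||f-f'||^2) for a scene shifted by u fine-grid points."""
    f_u = np.roll(f, u)                 # f(x) -> f(x - u/K)
    g_u = circ_conv(f_u, h)             # shifted pre-sampling image
    s = np.zeros_like(g_u)
    s[::K] = g_u[::K]                   # the sampling grid stays fixed
    f_rec = circ_conv(s, r)             # reconstructed image f'
    dx = 1.0 / K                        # fine-grid spacing in pixels
    return np.array([np.sum((f_u - g_u) ** 2),
                     np.sum((g_u - f_rec) ** 2),
                     np.sum((f_u - f_rec) ** 2)]) * dx

def phase_averaged_losses(f, h, r, K):
    """Average the three losses over the K equally likely sub-pixel shifts."""
    per_shift = np.array([losses_for_shift(f, h, r, K, u) for u in range(K)])
    return per_shift.mean(axis=0), per_shift

N, K = 256, 8
d = np.minimum(np.arange(N), N - np.arange(N))
h = np.exp(-0.5 * (d / 3.0) ** 2)
h /= h.sum()                                         # Gaussian gathering PSF, unit sum
r = np.maximum(0.0, 1.0 - d / K)                     # linear-interpolation PSF
f = (np.arange(N) % 37 < 11).astype(float)           # bar-pattern scene, not band-limited

mean_losses, per_shift = phase_averaged_losses(f, h, r, K)
print(per_shift[:, 0])   # image blur: the same for every phase
print(per_shift[:, 1])   # sampling and reconstruction blur: depends on the phase
print(mean_losses)       # phase-averaged values of the three metrics
```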


Sampling and Reconstruction Blur 

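By using sample-scene phase averaging, it can be shown that

$$E\big[\|g-f'\|^2\big] = \int_\mu\!\int_\nu \Big[\,|1-\hat r(\mu,\nu)|^2 + \sum_m\sum_n |\hat r(\mu-m,\,\nu-n)|^2\Big]\,|\hat g(\mu,\nu)|^2\,d\mu\,d\nu \tag{10}$$

where the double summation is over all $(m,n) \ne (0,0)$. However, an algebraically
equivalent representation provides more insight into the fidelity loss associated with sampling
and reconstruction. That is

$$E\big[\|g-f'\|^2\big] = \epsilon_s^2 + \epsilon_r^2 \tag{11a}$$

where

$$\epsilon_s^2 = \int_\mu\!\int_\nu \Big[\sum_m\sum_n |\hat g(\mu-m,\,\nu-n)|^2\Big]\,|\hat r(\mu,\nu)|^2\,d\mu\,d\nu \tag{11b}$$

and

$$\epsilon_r^2 = \int_\mu\!\int_\nu |1-\hat r(\mu,\nu)|^2\,|\hat g(\mu,\nu)|^2\,d\mu\,d\nu. \tag{11c}$$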

These two terms can be interpreted as follows [1].

• The term $\epsilon_s^2$ accounts for aliasing caused by undersampling; it measures the loss of
fidelity caused by the folding of significant image energy $|\hat g(\mu,\nu)|^2$ beyond the Nyquist
frequency into those (low) frequencies where the reconstruction filter response $\hat r(\mu,\nu)$
is not 0. Moreover

$$\epsilon_s^2 = 0 \iff \Big[\sum_m\sum_n |\hat g(\mu-m,\,\nu-n)|^2\Big]\,|\hat r(\mu,\nu)|^2 = 0 \text{ for all } (\mu,\nu).$$

• The term $\epsilon_r^2$ accounts for imperfect reconstruction; it measures the loss of fidelity
caused by the presence of significant image energy at those (high) frequencies where
$\hat r(\mu,\nu)$ is not 1. Moreover

$$\epsilon_r^2 = 0 \iff |1-\hat r(\mu,\nu)|^2\,|\hat g(\mu,\nu)|^2 = 0 \text{ for all } (\mu,\nu).$$

• If it were possible to produce a truly band-limited and sufficiently sampled image $g$,
and if the reconstruction function were then taken to be $r(x,y) = \mathrm{sinc}(x)\,\mathrm{sinc}(y)$, then
these two terms would be 0. (This is the sampling theorem.)
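
For a discrete, periodic 1-d analog, the decomposition (11a) holds exactly when the phase is averaged over the K sub-pixel shifts of a fine grid, which allows a direct numerical check. In the sketch below the replicas of the image spectrum sit N/K DFT bins apart. The discretization, the test image and the function names are illustrative assumptions, and `eps_s2`, `eps_r2` stand for the two terms in (11b) and (11c).

```python
import numpy as np

def spectral_terms(g, r, K):
    """Discrete analogs of (11b) and (11c) for a periodic signal with K points per pixel."""
    N = g.size
    M = N // K                                   # pixels per period; replica spacing in bins
    G2 = np.abs(np.fft.fft(g)) ** 2              # image energy spectrum
    R = np.fft.fft(r) / K                        # reconstruction response, R[0] = 1
    aliased = sum(np.roll(G2, j * M) for j in range(1, K))   # replicas with j != 0
    eps_s2 = np.sum(aliased * np.abs(R) ** 2) / (N * K)
    eps_r2 = np.sum(np.abs(1.0 - R) ** 2 * G2) / (N * K)
    return eps_s2, eps_r2

def spatial_average(g, r, K):
    """Phase-averaged ||g - f'||^2 computed directly in the spatial domain."""
    total = 0.0
    for u in range(K):
        g_u = np.roll(g, u)                      # shift by u/K of a pixel
        s = np.zeros_like(g_u)
        s[::K] = g_u[::K]                        # sample on the fixed grid
        f_rec = np.real(np.fft.ifft(np.fft.fft(s) * np.fft.fft(r)))
        total += np.sum((g_u - f_rec) ** 2) / K  # Riemann sum with dx = 1/K
    return total / K                             # mean over the K phases

N, K = 256, 8
d = np.minimum(np.arange(N), N - np.arange(N))
r = np.maximum(0.0, 1.0 - d / K)                 # linear interpolation, sum(r) = K
rng = np.random.default_rng(1)
g = np.convolve(rng.standard_normal(N), np.ones(5) / 5, mode="same")  # test image

eps_s2, eps_r2 = spectral_terms(g, r, K)
print(eps_s2 + eps_r2, spatial_average(g, r, K))   # the two values agree
```

The agreement follows because the cross terms between different spectral replicas carry phase factors that average to zero over the K shifts, leaving only the sum of squared magnitudes.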

End-To-End Blur 

In a similar manner, by using sample-scene phase averaging it can be shown that

$$E\big[\|f-f'\|^2\big] = \int_\mu\!\int_\nu [\text{non-negative}]\;|\hat f(\mu,\nu)|^2\,d\mu\,d\nu \tag{12a}$$

where the [non-negative] term is

$$|1-\hat h(\mu,\nu)\,\hat r(\mu,\nu)|^2 + |\hat h(\mu,\nu)|^2 \sum_m\sum_n |\hat r(\mu-m,\,\nu-n)|^2 \tag{12b}$$

and again the summation is over all $(m,n) \ne (0,0)$. Also, as before, an algebraically
equivalent representation provides more insight into the end-to-end fidelity loss. That is

$$E\big[\|f-f'\|^2\big] = \epsilon_s^2 + \epsilon_e^2 \tag{13a}$$

where $\epsilon_s^2$ is the sampling (aliasing) term defined previously and

$$\epsilon_e^2 = \int_\mu\!\int_\nu |1-\hat h(\mu,\nu)\,\hat r(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2\,d\mu\,d\nu. \tag{13b}$$


This new term can be interpreted as follows.

• It accounts for the end-to-end loss of fidelity caused by significant scene energy at (mid
to high) frequencies where the cascaded response, $\hat h(\mu,\nu)\,\hat r(\mu,\nu)$, is not 1. Moreover,

$$\epsilon_e^2 = 0 \iff |1-\hat h(\mu,\nu)\,\hat r(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2 = 0 \text{ for all } (\mu,\nu).$$

• It measures how well the reconstruction filter $\hat r$ is able to “deblur” (restore) those
spatial frequencies which were suppressed prior to sampling by the image gathering
OTF $\hat h$.

There is an inevitable trade-off here. For a fixed scene $f$ and sampling grid, any
attempt to decrease $\epsilon_e^2$ by modifying $h$ and $r$ will result in an increase in $\epsilon_s^2$, and conversely.
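
The algebraic equivalence of (12a), (12b) and (13a), (13b) rests on a change of variables in the aliasing sum. A sketch of that step, using only $\hat g = \hat h \hat f$ from (6a) and writing $\sum'$ for the double sum over $(m,n) \ne (0,0)$ (a shorthand introduced here), is:

```latex
\begin{align*}
\int_\mu\!\int_\nu |\hat h(\mu,\nu)|^2 \Big[{\sum_{m,n}}' |\hat r(\mu-m,\nu-n)|^2\Big] |\hat f(\mu,\nu)|^2 \,d\mu\,d\nu
 &= {\sum_{m,n}}' \int_\mu\!\int_\nu |\hat r(\mu-m,\nu-n)|^2\, |\hat g(\mu,\nu)|^2 \,d\mu\,d\nu \\
 &= {\sum_{m,n}}' \int_\mu\!\int_\nu |\hat r(\mu,\nu)|^2\, |\hat g(\mu+m,\nu+n)|^2 \,d\mu\,d\nu \\
 &= \int_\mu\!\int_\nu \Big[{\sum_{m,n}}' |\hat g(\mu-m,\nu-n)|^2\Big] |\hat r(\mu,\nu)|^2 \,d\mu\,d\nu .
\end{align*}
```

The second line substitutes $(\mu,\nu) \to (\mu+m,\nu+n)$, and the third relabels $(m,n) \to (-m,-n)$, which leaves the index set unchanged. The result is the aliasing term of (11b), so the second term of (12b) contributes the sampling term and the first contributes (13b).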

Fidelity Loss Budget 

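All of the previous analysis can be summarized in a fidelity loss budget given by the
three sample-scene phase averaged metrics

$$E\big[\|f-g\|^2\big] = \epsilon_i^2 \tag{14a}$$

$$E\big[\|g-f'\|^2\big] = \epsilon_s^2 + \epsilon_r^2 \tag{14b}$$

$$E\big[\|f-f'\|^2\big] = \epsilon_s^2 + \epsilon_e^2 \tag{14c}$$

where

$$\epsilon_i^2 = \int_\mu\!\int_\nu |1-\hat h(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2\,d\mu\,d\nu \tag{14d}$$

$$\epsilon_s^2 = \int_\mu\!\int_\nu \Big[\sum_m\sum_n |\hat g(\mu-m,\,\nu-n)|^2\Big]\,|\hat r(\mu,\nu)|^2\,d\mu\,d\nu \tag{14e}$$

$$\epsilon_r^2 = \int_\mu\!\int_\nu |\hat h(\mu,\nu)|^2\,|1-\hat r(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2\,d\mu\,d\nu \tag{14f}$$

$$\epsilon_e^2 = \int_\mu\!\int_\nu |1-\hat h(\mu,\nu)\,\hat r(\mu,\nu)|^2\,|\hat f(\mu,\nu)|^2\,d\mu\,d\nu. \tag{14g}$$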
The four $\epsilon^2$ terms can be easily calculated via numerical integration. All that is required is
a knowledge of the scene energy $|\hat f(\mu,\nu)|^2$, image gathering OTF $\hat h(\mu,\nu)$, and reconstruction
filter $\hat r(\mu,\nu)$, along with ready access to a computer with a fast CPU and sufficient memory.

The four $\epsilon^2$ terms are all interrelated and any attempt to minimize one must be
carefully weighed against the potential increase of the others. Trade-off studies like this
are the stuff of digital imaging system design.
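
A minimal 1-d sketch of such a calculation follows. Everything specific in it is an assumption made for illustration: the scene energy model, the Gaussian OTF, the linear-interpolation reconstruction filter (whose response is sinc squared), the integration limits, and the names `fidelity_budget` and `eps_i2` (used here for the image blur term (14d)).

```python
import numpy as np

def fidelity_budget(scene_energy, otf, rec, mu_max=20.0, n_pts=40001, n_alias=20):
    """Trapezoid-rule estimates of the four terms, as 1-d analogs of (14d) to (14g).

    scene_energy(mu) returns the scene energy spectrum, otf(mu) the image gathering
    OTF and rec(mu) the reconstruction filter response; frequencies are in
    cycles/pixel. The aliasing sum is truncated at |m| <= n_alias.
    """
    mu = np.linspace(-mu_max, mu_max, n_pts)
    dmu = mu[1] - mu[0]

    def integrate(y):
        return np.sum(0.5 * (y[1:] + y[:-1])) * dmu

    def image_energy(m):                        # pre-sampling image energy |h|^2 |f|^2
        return np.abs(otf(m)) ** 2 * scene_energy(m)

    F2, H, R = scene_energy(mu), otf(mu), rec(mu)
    aliased = sum(image_energy(mu - m) for m in range(-n_alias, n_alias + 1) if m != 0)
    return {
        "eps_i2": integrate(np.abs(1 - H) ** 2 * F2),                   # image blur
        "eps_s2": integrate(aliased * np.abs(R) ** 2),                  # aliasing
        "eps_r2": integrate(np.abs(H) ** 2 * np.abs(1 - R) ** 2 * F2),  # reconstruction
        "eps_e2": integrate(np.abs(1 - H * R) ** 2 * F2),               # end-to-end
    }

def scene(mu):
    """Assumed scene energy spectrum, decreasing rapidly but never band-limited."""
    return 1.0 / (1.0 + (mu / 0.1) ** 2) ** 1.5

def otf(mu, sigma=0.5):
    """OTF of a Gaussian PSF with standard deviation sigma pixels."""
    return np.exp(-2.0 * (np.pi * sigma * mu) ** 2)

def rec(mu):
    """Response of linear interpolation (triangle PSF): sinc squared."""
    return np.sinc(mu) ** 2

budget = fidelity_budget(scene, otf, rec)
print("E||f-g||^2  =", budget["eps_i2"])
print("E||g-f'||^2 =", budget["eps_s2"] + budget["eps_r2"])
print("E||f-f'||^2 =", budget["eps_s2"] + budget["eps_e2"])
```

Repeating the calculation over a range of `sigma` values is one simple way to explore the trade-off between the aliasing term and the blur terms.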